Abstract


Long non-coding RNAs (lncRNAs) are an abundant class of transcripts that have gained increasing
importance because of their roles in different biological processes. Although the function of most lncRNAs remains
unclear, they are implicated in the epigenetic regulation of gene expression, including during muscle development and
differentiation. We aimed to identify the effect of novel lncRNAs (alternatively spliced) and their target genes at
two stages of sheep skeletal muscle growth and development. FastQC was used for quality control and Trimmomatic
for trimming low-quality reads from twelve longissimus dorsi muscle tissue samples (six from young and six from
adult Texel sheep). Hisat2, Cufflinks, Cuffmerge, and Cuffdiff were used to investigate the expression levels.
Novel lncRNAs (alternatively spliced) were distinguished using the NONCODE database and Cuffcompare software.
In addition, the lncRNA-mRNA interactions were identified with RIsearch, and the regulatory network was visualized
with Cytoscape. In total, 139 novel lncRNA (alternatively spliced) transcripts were identified, of which 65 probably
interacted with their target genes and regulated sheep skeletal muscle growth and development. Three novel lncRNA
transcripts (TCONS_00041386, TCONS_00050059, and TCONS_00056428) showed a strong association with mRNAs, and
five transcripts (TCONS_00055761, TCONS_00055762, TCONS_00055763, TCONS_00055764, and TCONS_00055770) formed
complex network correlations with mRNAs. Our research provides more knowledge of the mechanisms associated with
novel lncRNAs, which could play a role in regulating sheep skeletal muscle tissue development and growth.
Introduction


Enhanced knowledge of the molecular mechanisms of myogenesis in livestock
(especially ram and lamb) may help raise meat production (Relaix and Zammit,
2012; Rashidian et al., 2020). Texel is a breed of domestic sheep that originates
from the island of Texel in the Netherlands. It is a well-muscled sheep that
produces a lean meat carcass and passes this quality on to its offspring. It is
currently a popular breed in Europe, Australia, New Zealand, and the United
States. The most notable feature of Texel is its significant muscle growth and
mass (double muscling phenotype) (Clark et al., 2018). Texel lambs show the
advantage of having a full leg score among breed comparisons and less total
carcass fat, especially contact fat (Kinka and Young, 2019).
RNA-seq (RNA sequencing) is a technique that can examine RNA quantity and
sequences via next-generation sequencing (NGS). It analyzes the transcriptome
profile of the cells (or gene expression patterns) in different groups or
treatments to understand the related biological processes, such as skeletal
muscle growth and development (Badday Betti et al., 2022). Some genes can have
several promoter regions and a tissue-specific expression pattern
(Ghanipoor-Samami and Javadmanesh, 2018). An alternatively spliced transcript
(class code "j") is a multi-exon transcript with at least one junction match
(Pertea and Pertea, 2020). Non-coding RNAs shorter than 200 bp are known as
small ncRNAs and consist of small nuclear RNAs (snRNAs), ribosomal RNAs (rRNAs),
transfer RNAs (tRNAs), small nucleolar RNAs (snoRNAs), Piwi-interacting RNAs
(piRNAs), small interfering RNAs (siRNAs), and microRNAs (miRNAs). Non-coding
RNAs longer than 200 bp are classified as long non-coding RNAs (Neguembor et
al., 2014). Gene expression profiling and in situ hybridization investigations
have shown that lncRNA expression is developmentally controlled, can be cell-
and tissue-type specific, and can differ temporally, spatially, or in response
to stimuli (Derrien et al., 2012). To date, only some lncRNAs have been
characterized in detail (Badday Betti et al., 2022).

Nevertheless, lncRNAs are believed to have a broad range of cellular and
developmental functions and have been described as significant regulators of
gene expression. LncRNAs may either inhibit or activate gene expression via
different mechanisms, complicating our understanding of genomic regulation. It
is estimated that 25-40% of coding genes have overlapping antisense
transcription. LncRNAs are differentially expressed across three developmental
periods of sheep muscle and might have vital functional roles in myogenic
differentiation (Chao et al., 2016). LncRNAs such as H19 play a role in multiple
biological processes, including negative regulation of body weight, cell
proliferation, and embryonic growth control (Gabory et al., 2009; Gabory et al.,
2010). LncRNAs are classified in different ways based on gene conservation or
function and play a role in chromatin modeling and genomic localization
(Ramakrishnaiah et al., 2020). LncRNAs can regulate muscle growth and
differentiation through cis-regulation, trans-regulation, or by acting as
competing endogenous RNAs, indicating that lncRNAs could be important regulatory
factors of muscle growth and potentially valuable molecular marker regions for
mutton sheep breeding (Ballarino et al., 2015). Ovine lncRNAs may be involved in
skeletal muscle development in Texel and Ujumqin sheep; those results revealed
that lncRNAs such as TCONS_00044801, TCONS_00008482, and TCONS_00102859
participate in muscle development (Li et al., 2018). A total of 39
differentially expressed lncRNAs were detected in mutton sheep. Subsequent
bioinformatics analyses revealed that 29 lncRNAs were associated with muscle
development, metabolism, cell proliferation, and apoptosis. Six lncRNAs were
identified as hub lncRNAs, and four lncRNAs showed potential regulatory
relationships (Chao et al., 2018).
Consequently, in our study, we aimed to identify the effect of novel lncRNAs
(alternatively spliced) and their target genes to improve knowledge of their
roles in sheep skeletal muscle growth and development at early and adult stages.
This might also provide insight into the regulatory genes and lay the foundation
for selection programs to improve meat production policies in sheep.


http://jcmr.um.ac.ir


Novel lncRNA in Sheep (Betti et al.)


Materials and Methods


Data Collection 

We retrieved the raw RNA-seq reads (paired-end) with accession numbers ERR4891
and ERR4892 for analysis from the Ensembl database. The twelve samples
represented two different functional stages: six longissimus dorsi muscle
tissue samples (ERR489116_1_fastqc to ERR489121_2_fastqc) from juvenile Texel
sheep (6-10 months), and six samples (ERR489188_1_fastqc to ERR489189_2_fastqc
and ERR489242_1_fastqc to ERR489245_2_fastqc) from adults (above one year).
These samples had been sequenced on the Sanger/Illumina 1.9 platform. The sheep
reference genome (Oar_v3.1) and the annotation GTF file (Oar_v3.1.96) were
downloaded from the Ensembl database.


RNA Sequences Data Analysis 


A total of 141079400 raw reads (2x151 bp, paired-end) had been generated on the
Illumina 1.9 platform before trimming, and their quality was analyzed with
FastQC (v0.11.5) software (Andrews, 2010). We then obtained clean data by
removing contaminating reads, including low-quality bases, adapters, and
low-quality reads, using Trimmomatic software (v.0.36) (Bolger et al., 2014).
Finally, reads with a minimum Phred quality score of 20 and a minimum length of
36 bp were retained.
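
The retention rule above (minimum Phred quality 20, minimum length 36 bp) can be illustrated with a minimal Python sketch. This is not Trimmomatic's algorithm and not necessarily the exact settings used in this study: the function names, the simple 3'-end trimming strategy, and the Phred+33 encoding are assumptions made only for illustration.

```python
def phred_scores(quality_string, offset=33):
    """Decode an ASCII quality string into Phred scores (Phred+33 is assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def trim_trailing(seq, qual, min_quality=20):
    """Drop bases from the 3' end while their Phred score is below min_quality."""
    scores = phred_scores(qual)
    end = len(seq)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return seq[:end], qual[:end]


def keep_read(seq, qual, min_quality=20, min_length=36):
    """Return the trimmed (seq, qual) pair, or None if it is shorter than min_length."""
    seq, qual = trim_trailing(seq, qual, min_quality)
    if len(seq) < min_length:
        return None
    return seq, qual
```

For example, a 40-base read whose last two bases have low quality is trimmed to 38 bases and kept, whereas a read with only 30 good bases falls below the 36 bp cutoff and is discarded.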


LncRNA Identification Pipeline 

We used the pipeline in Figure S1 (in the supplementary part) to distinguish the
candidate lncRNAs. In the first step, the sheep genome (Oar_v3.1) was indexed
with Hisat2-build, and the clean reads were aligned with Hisat2 (Kim et al.,
2015). We converted the output SAM files from Hisat2 to BAM files and sorted all
BAM files using Samtools software v.1.9 (Li et al., 2009). The mapped reads were
assembled using Cufflinks software (v2.2.1.OSX_x86_64) (Trapnell et al., 2010).
The GTF files of all 12 samples were merged using Cuffmerge software (v2.2.1)
(Trapnell et al., 2012), and the expression levels of the transcripts
(alternatively spliced) were determined.

In the second step, candidate transcripts with class code j (alternatively
spliced) were identified using Cuffcompare software (v2.2.1) (Trapnell et al.,
2010), based on the related annotation GTF file (Oar_v3.1.96). The following
criteria were used to detect the potential novel lncRNAs (alternatively
spliced):


1-Transcripts with a length > 200 bp were retained.
2-Transcripts with an exon number < 2 were removed.

3-The protein-coding potential of the transcripts was assessed using two tools,
CPC2 (score > 0.5) (Kang et al., 2017) and PLEK (score > 0) (Li et al., 2014).
Only transcripts predicted as non-coding by both tools were retained.

4-We predicted the open reading frame (ORF) using the TransDecoder tool (v5.5.0)
(https://transdecoder.github.io/), and transcripts with an ORF < 300 aa were
retained.

5-Novel lncRNAs (alternatively spliced) were distinguished using the NONCODE
database (v.5) (Fang et al., 2018) (http://www.noncode.org/).
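
The five criteria above can be summarized as a single predicate over per-transcript annotations. The following Python sketch is only an illustration: the `Transcript` record and its field names are hypothetical, the coding-potential calls of CPC2 and PLEK are represented as precomputed booleans rather than raw scores, and treating "novel" as "not already present in NONCODE" is an assumption about how the database was used.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    """Hypothetical per-transcript summary; field names are illustrative only."""
    transcript_id: str
    class_code: str        # Cuffcompare class code, e.g. "j"
    length_bp: int
    exon_count: int
    cpc2_noncoding: bool   # True if CPC2 called the transcript non-coding
    plek_noncoding: bool   # True if PLEK called the transcript non-coding
    longest_orf_aa: int    # longest predicted ORF, in amino acids
    in_noncode: bool       # True if already present in NONCODE (assumed meaning)


def is_candidate_novel_lncrna(t):
    """Apply the class-code filter and criteria 1 to 5 in order."""
    return (
        t.class_code == "j"
        and t.length_bp > 200                         # criterion 1
        and t.exon_count >= 2                         # criterion 2
        and t.cpc2_noncoding and t.plek_noncoding     # criterion 3
        and t.longest_orf_aa < 300                    # criterion 4
        and not t.in_noncode                          # criterion 5 (assumption)
    )


def filter_candidates(transcripts):
    """Keep only the transcripts that pass every criterion."""
    return [t for t in transcripts if is_candidate_novel_lncrna(t)]
```

Expressing the filters as one conjunction makes explicit that a transcript must pass every criterion; in practice each step is run by a separate tool and the counts after each step are reported in the Results.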


Interactions of Novel lncRNAs with Potential
Target Genes

The stability of mRNA can be modulated by lncRNA (Ramakrishnaiah et al., 2020).
We considered each "gene biotype = mRNA" group in the annotation GTF file
(Oar_v3.1.96) as potential target mRNAs. The lncRNA-mRNA interactions were
computed with the RIsearch software (RNA Interaction search) (Wenzel et al.,
2012) to investigate the target mRNA genes that may be regulated by the novel
lncRNAs, using a base-pairing free energy of no more than -50, in agreement with
Yuan et al. (2020).
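
The free-energy threshold described above amounts to keeping only the lncRNA-mRNA pairs whose predicted interaction energy is at or below -50. A minimal Python sketch follows; it does not parse RIsearch output, and the `(lncrna_id, mrna_id, energy)` tuples and identifiers are hypothetical placeholders for whatever table is derived from the tool.

```python
from collections import defaultdict

ENERGY_CUTOFF = -50.0  # base-pairing free energy threshold used in the text


def select_targets(interactions, cutoff=ENERGY_CUTOFF):
    """Group target mRNAs by lncRNA, keeping pairs with energy <= cutoff.

    `interactions` is an iterable of (lncrna_id, mrna_id, energy) tuples;
    this layout is an assumption, not the RIsearch output format.
    """
    targets = defaultdict(set)
    for lncrna_id, mrna_id, energy in interactions:
        if energy <= cutoff:
            targets[lncrna_id].add(mrna_id)
    return dict(targets)


# Usage with made-up identifiers and energies:
example = [
    ("lnc_A", "gene_1", -62.3),
    ("lnc_A", "gene_2", -41.0),   # weaker than the cutoff, dropped
    ("lnc_B", "gene_1", -50.0),   # exactly at the cutoff, kept
]
selected = select_targets(example)
```

Counting the keys of the returned dictionary gives the number of lncRNAs with at least one target, and the union of its values gives the number of distinct targeted genes.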


Construction of ncRNA-mRNA Network 

Cytoscape software v.3.7.2 (Shannon et al., 2003) was used to visualize the
lncRNA-mRNA regulatory network.


Result 


RNA Sequencing and Mapping 

Possible functional lncRNAs, along with alternatively spliced novel lncRNAs
that may be implicated in the growth and development of sheep skeletal muscle
tissue, were identified. The quality control, trimming, and mapping results for
the 12 samples are given in Table 1.


Table 1. Quality control, trimming, and mapping results for the 12 samples


Sample (paired read files)               Stage  Raw reads before trimming  Reads after trimming  GC content (%)  Aligned concordantly exactly (%)  Overall alignment rate (%)
ERR489116_1_fastqc / ERR489116_2_fastqc  young  8647125                    6622025               44              42.20                             83.14
ERR489117_1_fastqc / ERR489117_2_fastqc  young  13368944                   10132560              45              51.08                             94.46
ERR489118_1_fastqc / ERR489118_2_fastqc  young  13756065                   10454415              45              51.17                             94.38
ERR489119_1_fastqc / ERR489119_2_fastqc  young  8685734                    6641245               44              43.72                             84.72
ERR489120_1_fastqc / ERR489120_2_fastqc  young  13374250                   10156487              45              51.11                             94.52
ERR489121_1_fastqc / ERR489121_2_fastqc  young  13831915                   10539372              45              50.79                             94.49
ERR489188_1_fastqc / ERR489188_2_fastqc  adult  19697023                   15023092              46              63.77                             94.04
ERR489189_1_fastqc / ERR489189_2_fastqc  adult  19792913                   10350142              46              63.94                             94.22
ERR489242_1_fastqc / ERR489242_2_fastqc  adult  9016506                    7220274               45              50.03                             91.87
ERR489243_1_fastqc / ERR489243_2_fastqc  adult  5771315                    4633682               45              48.80                             89.71
ERR489244_1_fastqc / ERR489244_2_fastqc  adult  9285314                    7445816               45              50.25                             91.85
ERR489245_1_fastqc / ERR489245_2_fastqc  adult  5852296                    4703164               45              48.11                             89.99


Total raw reads before trimming = 141079400

Total reads after trimming = 103922274

Total low-quality reads removed = 37157126
Average aligned concordantly exactly = 51.2475%
Average overall alignment rate = 91.44917%
Average GC content = 45%
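
The summary lines under Table 1 follow directly from the per-sample rows. As a cross-check, the short Python sketch below recomputes them; the per-sample values are copied from Table 1, while the tuple layout and the function name are illustrative choices.

```python
# (raw reads before trimming, reads after trimming, GC %, concordant %, overall %)
# Values copied from Table 1, in row order ERR489116 ... ERR489245.
TABLE_1 = [
    (8647125, 6622025, 44, 42.20, 83.14),
    (13368944, 10132560, 45, 51.08, 94.46),
    (13756065, 10454415, 45, 51.17, 94.38),
    (8685734, 6641245, 44, 43.72, 84.72),
    (13374250, 10156487, 45, 51.11, 94.52),
    (13831915, 10539372, 45, 50.79, 94.49),
    (19697023, 15023092, 46, 63.77, 94.04),
    (19792913, 10350142, 46, 63.94, 94.22),
    (9016506, 7220274, 45, 50.03, 91.87),
    (5771315, 4633682, 45, 48.80, 89.71),
    (9285314, 7445816, 45, 50.25, 91.85),
    (5852296, 4703164, 45, 48.11, 89.99),
]


def summarize(rows):
    """Recompute the totals and averages reported under Table 1."""
    n = len(rows)
    total_before = sum(r[0] for r in rows)
    total_after = sum(r[1] for r in rows)
    return {
        "total_before": total_before,
        "total_after": total_after,
        "removed": total_before - total_after,
        "mean_gc": sum(r[2] for r in rows) / n,
        "mean_concordant": sum(r[3] for r in rows) / n,
        "mean_overall": sum(r[4] for r in rows) / n,
    }


summary = summarize(TABLE_1)
```

The recomputed totals and averages agree with the values listed under the table.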


Identification of Novel lncRNAs in Skeletal
Muscle of Sheep


Five criteria were used to distinguish the novel lncRNAs (alternatively spliced)
with the pipeline (Figure S1) at the two stages (young and adult) of skeletal
muscle tissue in sheep. First, we filtered the 649300 assembled transcripts for
class code = j (alternatively spliced) using Cuffcompare software, which left
368099 transcripts classified as class code = j; selecting class code = j thus
retained approximately 56% of the assembled transcripts. Of these 368099
retained transcripts, 358072 were removed by filtering out those with length
< 200 bp or exon number < 2. The 10027 remaining transcripts were assessed for
coding potential using two tools, CPC2 (score > 0.5) (Kang et al., 2017) and
PLEK (score > 0) (Li et al., 2014). We obtained 2910 coding and non-coding
transcripts, of which only 310 were considered to have non-coding potential;
any transcript predicted as coding by either of the two tools was removed. The
non-coding transcripts were then screened with the TransDecoder tool (v5.5.0)
to remove those with an ORF ≥ 300 aa. The 308 remaining non-coding transcripts
were compared against the NONCODE database (v.5) to identify the novel lncRNAs
(alternatively spliced). Finally, we identified 139 potential novel lncRNAs.


Interaction Between Novel lncRNAs and Target
mRNA Genes


To predict the binding positions between the novel lncRNAs (alternatively
spliced) and the targeted mRNA genes, the 139 novel lncRNA sequences (query)
were tested for interaction with 10921 mRNA gene sequences (target) using
RIsearch software (Wenzel et al., 2012), with a base-pairing free energy
threshold of no more than -50. Our findings demonstrated that 65 novel lncRNA
transcripts targeted 263 mRNA genes (Table S1). Three novel lncRNA transcripts,
TCONS_00041386, TCONS_00050059, and TCONS_00056428, had a strong relationship
with targeted mRNA genes, and five novel lncRNA transcripts, TCONS_00055761,
TCONS_00055762, TCONS_00055763, TCONS_00055764, and TCONS_00055770, formed
complicated network correlations with targeted mRNA genes. Moreover, based on
transcript expression values in fragments per kilobase per million (FPKM), six
of these novel lncRNA transcripts (TCONS_00050059, TCONS_00056428,
TCONS_00055761, TCONS_00055762, TCONS_00055763, and TCONS_00055764) had higher
expression in the young stage than in the adult stage, whereas two transcripts,
TCONS_00041386 and TCONS_00055770, had lower expression at the early stage.
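
For reference, the basic FPKM definition normalizes a transcript's fragment count by its length in kilobases and by the number of mapped fragments in millions. The Python sketch below shows only this textbook formula; Cufflinks estimates transcript abundances with a statistical model, so the function is not a reimplementation of that tool, and the parameter names and example numbers are illustrative.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    FPKM = fragments * 1e9 / (length in bp * total mapped fragments)
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Made-up example: 100 fragments on a 2000 bp transcript in a library of
# 10 million mapped fragments.
example_fpkm = fpkm(100, 2000, 10_000_000)
```

Because the value is normalized for both transcript length and library size, FPKM values of the same transcript can be compared between the young and adult samples.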


Construction of ncRNA-mRNA Network 


All 65 novel lncRNA transcripts were selected to construct the interaction
network between novel lncRNAs and targeted mRNA genes using Cytoscape software
v.3.7.2 (Shannon et al., 2003) (Figure S2). The regulatory network showed that,
among the 65 novel lncRNA transcripts, TCONS_00041386 connected with 48 mRNA
genes, TCONS_00050059 with 28 mRNA genes, and TCONS_00056428 with 20 mRNA genes,
while five novel lncRNA transcripts (TCONS_00055761, TCONS_00055762,
TCONS_00055763, TCONS_00055764, and TCONS_00055770) connected with 146 mRNA
genes to form complex network correlations. We also noticed that the BEST2 and
MFSD4B genes were connected with 18 and 12 novel lncRNAs, respectively, forming
a strong correlation.
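
The connection counts reported above correspond to node degrees in a bipartite lncRNA-mRNA network. A minimal Python sketch of that counting is given below; it is independent of Cytoscape, and the edge list with its identifiers is a made-up example rather than data from this study.

```python
from collections import Counter


def node_degrees(edges):
    """Count distinct partners per lncRNA and per mRNA gene.

    `edges` is an iterable of (lncrna_id, mrna_id) pairs; duplicate pairs
    are counted once.
    """
    lncrna_degree = Counter()
    mrna_degree = Counter()
    for lncrna_id, mrna_id in set(edges):
        lncrna_degree[lncrna_id] += 1
        mrna_degree[mrna_id] += 1
    return lncrna_degree, mrna_degree


# Usage with hypothetical identifiers:
example_edges = [
    ("lnc_A", "gene_1"),
    ("lnc_A", "gene_2"),
    ("lnc_B", "gene_1"),
    ("lnc_A", "gene_1"),  # duplicate pair, ignored
]
lnc_deg, gene_deg = node_degrees(example_edges)
```

Sorting either counter with `most_common()` lists the most highly connected lncRNAs or genes first, which is how hub nodes of the kind described above can be picked out.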


Discussion 


Accumulating evidence has indicated that long non-coding RNAs (lncRNAs) play
vital roles in differentiation, development, and human disease by regulating
gene expression (Saliani et al., 2021 and 2022). The lncRNAs act through
multiple mechanisms, which have been classified into six main paradigms: R-loop,
miRNA decoy or sponge, scaffolds, tripartite helix, stabilizing mRNA, and guides
(Ramakrishnaiah et al., 2020). Increasing evidence indicates that lncRNAs can
modulate nearly every cellular process through their association with mRNAs,
DNAs, miRNAs, and proteins (Li et al., 2019; Badday Betti et al., 2022).
Therefore, RIsearch software (Wenzel et al., 2012) was used to detect binding
positions and analyze interactions between novel lncRNAs and targeted mRNAs,
with a base-pairing free energy of no more than -50 (Yuan et al., 2020). In the
present study, comprehensive RNA-seq analysis was used to identify the effect of
novel lncRNAs (alternatively spliced) and their target mRNAs in longissimus
dorsi muscle tissue samples of sheep at two functional stages.


[Figure 1 image: "Alternative Splicing" diagram showing exons 1 to 4 and the resulting lncRNA transcript 1 and lncRNA transcript 2]

Figure 1. Schematic image of alternative splicing of lncRNA
The results implied a correlation between the novel lncRNA transcripts with
differential expression levels and muscle tissue in young and adult
individuals. The mRNA genes targeted by the novel lncRNA transcripts, extracted
from the annotation GTF file, have been mentioned in many previous studies of
ovine muscle tissue, in which they play pivotal roles in skeletal muscle growth
and development. Our results showed that 65 novel lncRNA transcripts had binding
locations with 263 targeted mRNA genes (Yuan et al., 2020; Li et al., 2018 and
2019). Three novel lncRNA transcripts were strongly associated with their
targeted genes: TCONS_00041386 targeted 48 genes, TCONS_00050059 targeted 28
genes, and TCONS_00056428 targeted 20 genes. Clark et al. (2017) showed that
only 31 of the 48 genes, 14 of the 28 genes, and 9 of the 20 genes,
respectively, were differentially expressed in skeletal muscle between Texel
(purebred) and T x BF (hybrid Texel x Scottish Blackface) sheep. In addition,
TCONS_00055769, a novel lncRNA, targeted 29 genes and formed a complex network,
but only 19 of the 29 genes appeared as differentially expressed in sheep
skeletal muscle between the two breeds (Clark et al., 2017). In agreement with
the previous study, TCONS_00047742, TCONS_00035416, and TCONS_00042104 each
interacted with 10 genes, but only 5, 2, and 3 of these 10 genes, respectively,
were reported as differentially expressed in sheep skeletal muscle (Clark et
al., 2017). TCONS_00042100, TCONS_00042102, and TCONS_00042103 interacted with
the same 7 genes, but only 2 of the 7 genes were reported as differentially
expressed in sheep skeletal muscle (Clark et al., 2017).


The other five lncRNA transcripts correlated with 146 mRNA genes and formed
complicated networks. For instance, TCONS_00055761 and TCONS_00055763
interacted with the same 27 genes, TCONS_00055762 interacted with 34 genes,
TCONS_00055764 interacted with 30 genes, and TCONS_00055770 interacted with 28
genes; among them, 18, 21, 18, and 17 genes, respectively, were also mentioned
by Clark et al. as differentially expressed in sheep skeletal muscle.
Additionally, we observed that some of these genes, such as ENSOARG00000001608
(targeted by TCONS_00055762, TCONS_00055770, and TCONS_00055764),
ENSOARG00000016908 (targeted by TCONS_00055762 and TCONS_00055764), ECI1,
ENSOARG00000015179, ANPEP, ENSOARG00000018514 (targeted by TCONS_00041386), and
NRAP (targeted by TCONS_00056428), had higher expression levels in sheep
skeletal muscle (fold change (FC) = 2) (Clark et al., 2017). We found that the
ABL2 gene, targeted by the novel transcript TCONS_00055764, was also reported by
Yuan et al. as one of the differentially expressed genes in the longissimus
dorsi muscle of sheep (Yuan et al., 2020). The HSP90AA1, CXCL14, and NRAP genes
were targeted by the novel transcripts TCONS_00041386, TCONS_00050059, and
TCONS_00056428, respectively. Lobjois et al. reported that the HSP90AA1 gene is
involved in myogenic differentiation and cell proliferation in pig longissimus
dorsi muscle (LDM) (Lobjois et al., 2008). The CXCL14 gene was found to have a
vital role in the cell differentiation of chicken muscles (Nihashi et al.,
2019). Noce et al. showed that the NRAP gene was expressed in the longissimus
dorsi muscle (LDM) of five sheep breeds (Noce et al., 2018). Other genes, such
as PCK1, IRX3, CCND3, COLGALT2, KCNN1, UBE2Q1, CCDC88C, and SLC25A13, which
were targeted by some of the novel transcripts (such as TCONS_00041386,
TCONS_00050059, TCONS_00055762, TCONS_00055764, TCONS_00055770, and
TCONS_00055761), have been reported with differential expression patterns in
skeletal muscle and in high- and low-production mutton sheep.


Our study demonstrated that six novel lncRNA transcripts (TCONS_00050059,
TCONS_00056428, TCONS_00055761, TCONS_00055762, TCONS_00055763, and
TCONS_00055764) had higher expression levels during the early stage than in the
adult stage. This result agrees with Yuan et al. (2020), who found
time-specific lncRNA expression across the pregnancy and after-birth stages.
Furthermore, they observed essential changes in skeletal muscle development
across the gestation and newborn stages. In conclusion, we detected three novel
lncRNA transcripts (TCONS_00041386, TCONS_00050059, and TCONS_00056428) that
had a strong relationship with targeted mRNA genes, and five novel lncRNA
transcripts (TCONS_00055761, TCONS_00055762, TCONS_00055763, TCONS_00055764,
and TCONS_00055770) that formed complex network correlations with targeted mRNA
genes. Six novel lncRNA transcripts (TCONS_00050059, TCONS_00056428,
TCONS_00055761, TCONS_00055762, TCONS_00055763, and TCONS_00055764) had higher
expression in the young stage than in the adult stage. Finally, our findings
suggest that novel lncRNAs (alternatively spliced) may play a critical role in
regulating sheep skeletal muscle growth and development and in improving mutton
sheep breeding programs. Moreover, it is important to note that determining the
exact regulatory functions of lncRNAs needs more investigation and research.
Figure S1. Flow chart of the pipeline used to detect
the novel lncRNAs (alternatively spliced)


Table S1. The results of the interaction between
lncRNAs and their target mRNA genes; red arrows